Spectrum estimation for large dimensional covariance matrices using random matrix theory

Author

  • Noureddine El Karoui
Abstract

Estimating the eigenvalues of a population covariance matrix from a sample covariance matrix is a problem of fundamental importance in multivariate statistics; the eigenvalues of covariance matrices play a key role in many widely used techniques, in particular in Principal Component Analysis (PCA). In many modern data analysis problems, statisticians are faced with large datasets where the sample size, n, is of the same order of magnitude as the number of variables, p. Random matrix theory predicts that in this context, the eigenvalues of the sample covariance matrix are not good estimators of the eigenvalues of the population covariance matrix. We propose to use a fundamental result in random matrix theory, the Marčenko-Pastur equation, to better estimate the eigenvalues of large dimensional covariance matrices. The Marčenko-Pastur equation holds in very wide generality and under weak assumptions. The estimator we obtain can be thought of as “shrinking” the eigenvalues of the sample covariance matrix in a nonlinear fashion to estimate the population eigenvalues. Inspired by ideas of random matrix theory, we also suggest a change of point of view when thinking about estimation of high-dimensional vectors: we do not try to estimate the vectors directly but rather a probability measure that describes them. We think this is a theoretically more fruitful way to think about these problems. Our estimator is fast and gives good or very good results in extensive simulations. Our algorithmic approach is based on convex optimization. We also show that the proposed estimator is consistent.
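
The claim that sample eigenvalues are poor estimators when n and p are comparable can be illustrated with a small simulation. The sketch below is not the estimator proposed in the paper; it only simulates Gaussian data whose population covariance is the identity (so every population eigenvalue equals 1) and shows how far the sample eigenvalues spread. It also checks numerically one standard form of the Marčenko-Pastur equation, -1/v(z) = z - γ ∫ λ dH(λ) / (1 + λ v(z)), where γ = p/n, H is the population spectral distribution (here a point mass at 1) and v is the Stieltjes transform of the spectrum of the n × n matrix X Xᵀ/n. The function names, the dimensions n = 1000 and p = 500, the evaluation point z and this notation are assumptions chosen for illustration.

```python
import numpy as np


def sample_eigenvalues(n, p, seed=0):
    """Eigenvalues of the sample covariance X^T X / n when the population covariance is I."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))  # n observations of p independent N(0, 1) variables
    S = X.T @ X / n                  # p x p sample covariance (mean assumed known to be zero)
    return np.linalg.eigvalsh(S)     # eigenvalues in ascending order


def mp_support_edges(gamma):
    """Edges of the Marchenko-Pastur support for identity covariance, gamma = p / n <= 1."""
    root = np.sqrt(gamma)
    return (1.0 - root) ** 2, (1.0 + root) ** 2


def companion_stieltjes(eigs, n, z):
    """Stieltjes transform at z of the spectrum of the n x n matrix X X^T / n."""
    gamma = eigs.size / n
    m = np.mean(1.0 / (eigs - z))    # Stieltjes transform of the p x p spectrum
    return -(1.0 - gamma) / z + gamma * m


# Illustrative dimensions (assumed): sample size and number of variables are comparable.
n, p = 1000, 500
gamma = p / n
eigs = sample_eigenvalues(n, p)
lo, hi = mp_support_edges(gamma)

# Residual of the Marchenko-Pastur equation with H a point mass at 1:
#   -1 / v(z) = z - gamma / (1 + v(z))
z = 1.0 + 0.5j
v = companion_stieltjes(eigs, n, z)
residual = abs(-1.0 / v - (z - gamma / (1.0 + v)))

print(f"sample eigenvalues range: [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"Marchenko-Pastur support: [{lo:.3f}, {hi:.3f}]")
print(f"residual of the Marchenko-Pastur equation at z = {z}: {residual:.2e}")
```

Although every population eigenvalue is 1 in this setup, the sample eigenvalues fill an interval close to the Marčenko-Pastur support, which is the kind of distortion a nonlinear shrinkage of the sample spectrum is meant to undo.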

Similar resources

Exponential Weighting and Random-Matrix-Theory-Based Filtering of Financial Covariance Matrices for Portfolio Optimization

We introduce a covariance matrix estimator that both takes into account the heteroskedasticity of financial returns (by using an exponentially weighted moving average) and reduces the effective dimensionality of the estimation (and hence measurement noise) via techniques borrowed from random matrix theory. We calculate the spectrum of large exponentially weighted random matrices (whose upper ba...

Large Dimensional Random Matrix Theory for Signal Detection and Estimation in Array Processing

In this paper, we bring into play elements of the spectral theory of large dimensional random matrices and demonstrate their relevance to source detection and bearing estimation in problems with sizable arrays. These results are applied to the sample spatial covariance matrix, R̂, of the sensed data. It is seen that detection can be achieved with a sample size considerably less than that require...

Consistent Estimation of Large-Dimensional Sparse Covariance Matrices

Estimating covariance matrices is a problem of fundamental importance in multivariate statistics. In practice it is increasingly frequent to work with data matrices X of dimension n×p, where p and n are both large. Results from random matrix theory show very clearly that in this setting, standard estimators like the sample covariance matrix perform in general very poorly. In this “large n, larg...

Operator norm consistent estimation of large dimensional sparse covariance matrices

Estimating covariance matrices is a problem of fundamental importance in multivariate statistics. In practice it is increasingly frequent to work with data matrices X of dimension n × p, where p and n are both large. Results from random matrix theory show very clearly that in this setting, standard estimators like the sample covariance matrix perform in general very poorly. In this “large n, la...

Estimation of the sample covariance matrix from compressive measurements

This paper focuses on the estimation of the sample covariance matrix from low-dimensional random projections of data known as compressive measurements. In particular, we present an unbiased estimator to extract the covariance structure from compressive measurements obtained by a general class of random projection matrices consisting of i.i.d. zero-mean entries and finite first four moments. In ...

Publication date: 2008